System and method for message processing and routing

ABSTRACT

A message routing system that allows applications at either end of the system to run as-is without modification. The system functions in a multithreaded environment and is capable of handling complex routing rules and message transformation. It is also capable of learning and executing new routing rules and message transformations in formats previously unrecognized by the system. The system enables precise and reliable logging of messages throughout processing and supports publication of enterprise-wide broadcast messages. The system further preferably employs cooperating inbound and outbound transport processes for consuming, routing, processing, safely storing and publishing messages in batches of logical units of work to ensure that the logical units of work are not lost in system transactions. The system also preferably utilizes a replay server for preserving and replaying messages that might otherwise fail to reach their intended destinations.

FIELD OF THE INVENTION

The present invention relates to a messaging system and method for processing and routing messages in a computer network environment.

BACKGROUND OF THE INVENTION

In a computing environment where large amounts of data are moved between various locations, for example in connection with stock trading, it is desirable to move the data as efficiently as possible. One early method for doing so, as illustrated in FIG. 1, was to transfer the data from a main data source 100 as a whole data file 102 via File Transfer Protocol (FTP) to routers 110, 112, 114 located in different areas where the data would need to be distributed. (The geographic locations noted in FIG. 1 are for illustrative purposes only, to show how widely dispersed the data destinations may be.)

Each of the routers 110, 112, 114 contains a local network file server that parses the data file 102 and generates a plurality of smaller data files 116, which are distributed to local destinations 120 a, 120 b, 122 a, 122 b, 124 a, 124 b. The number of local destinations shown in FIG. 1 can be any number of destinations that need to access data from the file 102.

There are two major disadvantages to the arrangement shown in FIG. 1. First, the data is not sent in real time, leading to an undesired delay in processing the data. Second, the entire data file 102 had to be sent to multiple locations 110, 112, 114 in order to be distributed to the ultimate destinations 120 a-124 b, resulting in large amounts of unnecessary computer network traffic. Because of these disadvantages, the data file 102 was actually parsed and divided multiple times, as opposed to as few as once, thereby creating a process that was inefficient, processor intensive, and not in real time.

In a setting like stock trading, access to data in real time is critical in order to be able to make the best possible trades at a given point in time. In an effort to overcome the inefficiencies using an FTP-based data transfer, a similar arrangement was used on top of a messaging platform which could distribute the data in real time, as shown in FIG. 2.

Modern computer networks are rarely homogeneously constructed; they are often a collection of old and new systems from a variety of vendors and operate on a variety of platforms. Across an enterprise, it is critical that the disparate parts of a computer network communicate with each other in some form. One solution to this problem is to utilize a messaging platform that runs across various systems while providing a common message format. A common messaging platform typically involves a publish-subscribe metaphor, in which information is published to a particular subject or topic, and any party interested in receiving that information subscribes to that subject (this may also be referred to as consuming off a particular subject). In this environment, a consumer only receives information that is of interest; any other, non-relevant information is not published to the subject. Examples of such a messaging platform include ETX from TIBCO Software, Inc. and MQSeries from International Business Machines Corporation.
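The publish-subscribe metaphor described above can be illustrated with a minimal in-memory sketch. It is not the API of ETX, MQSeries, or any other real messaging platform; the `Broker` class, the subject names, and the message fields are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory publish-subscribe broker (hypothetical, illustration only)."""

    def __init__(self):
        # subject name -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, subject, callback):
        """Register a consumer callback for one subject."""
        self._subscribers[subject].append(callback)

    def publish(self, subject, message):
        """Deliver the message only to consumers of this subject.

        Returns the number of consumers that received the message.
        """
        consumers = self._subscribers[subject]
        for callback in consumers:
            callback(message)
        return len(consumers)


broker = Broker()
received_ny = []
received_ldn = []
broker.subscribe("trades.ny", received_ny.append)
broker.subscribe("trades.ldn", received_ldn.append)

# Only the consumer subscribed to "trades.ny" sees this message.
delivered = broker.publish("trades.ny", {"account": "A100", "qty": 50})
```

In this sketch the consumer of "trades.ldn" never sees the message, which mirrors the point that a consumer receives only information published to a subject it has subscribed to.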

To route the data to its final destination, it must be published to a subject that the destination subscribes to. Since there is some overhead in terms of time in determining the proper subject on which to publish a message, a message can be published to a “general” subject and the specific subject of the message can be determined thereafter. One solution to this problem is to use a router to examine the message and to determine the specific topic on which the message should be published.

As shown in FIG. 2, a data source 200 publishes messages 202, all of which are consumed by a general data router (GDR) 210. The router 210 parses the messages 202 and publishes the parsed messages on new subjects 212, 214, 216, which are destined for second-level routers 220, 222, 224, respectively. The second-level routers 220, 222, 224 examine the message a second time, and republish the message on a specific subject 226 for a particular end destination 230 a, 230 b, 232 a, 232 b, 234 a, 234 b.

The router 210 parses a message 202 by examining the contents of the message 202, evaluating a particular key contained within the message 202, and based upon the value of the key, determines the proper second-level router 220, 222, 224 to which it should publish the message 202. The second-level routers 220, 222, 224 examine the message in the same manner as the router 210, but with a finer level of granularity, in order to determine the specific destination 230 a-234 b for the message. Simply stated, the message 202, when published, does not have a destination address associated with it, but that address can be built dynamically by the routers 210 and 220, 222, or 224, by looking up what is in the message 202, building the address for the message 202, and publishing the message 202 to its final destination 230 a-234 b.

One of the goals in using a messaging platform and the multiple routers is to extract some of the complexity from both the publisher and the consumer and place that logic into a centralized layer, such that it is essentially considered by both end publishers and end consumers to be part of the messaging platform. This is one of the focus points of enterprise application integration (EAI), making it easier for disparate systems to communicate with one another. By placing the routing logic in a centralized location, the administration of the logic is simplified, since only one location needs to be updated when changes are made.

In order to simplify what a particular second-level router 220, 222, 224 needs to understand, it can be specified what is unique about an instance of the application that can be found in the message. But there is still the problem, from the publisher's (200) perspective, of how to identify which specific destination 230 a-234 b to send the message. In a publish-subscribe environment, this problem is solved by publishing to a subject subscribed to by the specific destination. If the router 210 was not present, each of the second-level routers 220, 222, 224 would need to discard any messages that were not intended for them; this would merely replicate one of the disadvantages of using FTP as noted above, but in connection with a messaging platform. The router 210 helps to reduce the amount of unnecessary data traffic by reducing the number of messages that need to be sent. Ideally, no message is duplicated, nor is a message sent to more than one location.

One disadvantage of this use of the messaging platform is that there are multiple instances of routers operating at the same time, which creates management issues of having to coordinate several pieces of software. While the routers are executing the same code base, each router is applying different routing rules, depending upon the router's location in the message flowpath. Furthermore, each router is only able to apply one routing rule. To apply multiple routing rules to one message, multiple routers need to be arranged in sequence, necessarily creating a complicated network design. The design shown in FIG. 2 is also a single thread of execution, which limits the throughput of the routing system to about 35 messages per second (assuming an average message size of two kilobytes). In the example noted above of a large stock trading system, a real-time flow of data easily exceeds 35 messages per second.

It is desirable to create a routing system that utilizes a single application to execute multiple routing rules on a single message, that is multithreaded in order to increase the throughput of the system, and is messaging platform agnostic such that disparate messaging platforms can be used on either side of a publish-subscribe or a point-to-point transaction.

FIG. 3 shows how a single router of the prior art operates while processing a message. A router 300 accepts an inbound message 302, processes the inbound message 302 and outputs an outbound message 304. The contents of the inbound message 302 and the outbound message 304 are going to be identical. The goal of the router 300 is to examine the contents of the inbound message 302, which is published to a general subject, and from those contents determine the specific subject on which the outbound message 304 should be published for consumption by the ultimate recipient of the outbound message 304.

The inbound message 302 is first examined at block 310, where an introspection module is called. The particular introspection module to be called is dependent upon the subject of the inbound message 302 and is retrieved from an introspection module library 312. An introspection module (a/k/a key extraction routine) is a customized routine that complies with a particular interface. It can be loaded dynamically according to a configuration of a particular routing instance and it contains the logic for examining a specific type of message. This code will read the inbound message 302 and extract the information needed to determine how to route the message 302 to the proper specific subject, namely a routing key. The information to be extracted and used as the routing key is defined in the introspection module, which is why a different introspection module is required for each different routing rule to be applied. For example, in the stock trade example, the account number associated with the trade can be used as the routing key.

At block 320, the routing key is extracted from the inbound message 302 and the value of the routing key is evaluated. This value is matched against a keymap table 322 to determine the routing tag or target for the inbound message 302. The keymap table 322 is a two column table that lists the values of the routing key in one column and the matching routing tags for those values in another column. Because the router 300 can only operate on one routing rule, the keymap table 322 will be the same for all inbound messages 302. The data in the keymap table 322 can be cached locally within the router 300 for rapid access to the data. During the initialization of the router 300, the keymap table 322 is loaded into the router's memory from an external routing information database 324.

Once the routing tag of the inbound message 302 has been identified, at block 330, the routing tag is used to access an outbound routing table 332 to identify the outbound subject for the inbound message 302. The outbound routing table 332 is a two column table that lists the values of the routing tag in one column and the outbound subjects for those values in another column. As with the keymap table 322, the outbound routing table 332 can be cached in local memory during the initialization of the router 300 by loading the outbound routing table 332 from the routing information database 324. In block 340, the inbound message 302 is published to the new subject as outbound message 304.
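The single-rule flow of FIG. 3 (introspection, keymap lookup, outbound routing lookup, publication) can be sketched as two dictionary lookups. This is only an illustrative sketch: the table contents, the subject names, and the function names are hypothetical, and the two dictionaries merely stand in for the cached keymap table 322 and outbound routing table 332.

```python
def extract_account(message):
    """Hypothetical introspection module: the routing key is the account number."""
    return message["account"]


# Stand-in for keymap table 322: routing key value -> routing tag.
KEYMAP_TABLE = {"A100": "TAG_NY", "B200": "TAG_LDN"}

# Stand-in for outbound routing table 332: routing tag -> outbound subject.
OUTBOUND_ROUTING_TABLE = {"TAG_NY": "trades.ny", "TAG_LDN": "trades.ldn"}


def route_single_rule(message, introspect=extract_account):
    """Apply one routing rule and return (outbound subject, unchanged message)."""
    key = introspect(message)               # blocks 310/320: extract the routing key
    tag = KEYMAP_TABLE[key]                 # block 320: routing key -> routing tag
    subject = OUTBOUND_ROUTING_TABLE[tag]   # block 330: routing tag -> outbound subject
    return subject, message                 # block 340: publish with contents unchanged


subject, outbound = route_single_rule({"account": "B200", "qty": 5})
```

Because the router applies only one rule, the same two tables serve every inbound message, which is why they can be loaded once at initialization.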

FIG. 4 shows how the prior art applied multiple routing rules to a single inbound message 400. Because each router of the prior art was only capable of applying a single rule, it was necessary to string multiple routers together to be able to apply multiple rules to a single message. (The concept of multiple routing rules will be discussed below in connection with FIG. 5.) As shown in FIG. 4, an inbound message 400 is examined by a first router 410, which applies a first rule to the inbound message 400 and then, if the inbound message 400 meets the criteria of the first rule, publishes the inbound message 400 as an outbound message 412 for a first consumer 414. The inbound message 400 is then passed to a second router 420, which applies a second rule to the inbound message 400 and then, if the inbound message meets the criteria of the second rule, publishes the inbound message 400 as an outbound message 422 for a second consumer 424, and so on.
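The chained arrangement of FIG. 4 can be pictured as a list of single-rule routers through which each message passes in turn. The sketch below is a simplification under stated assumptions: the rules, subject names, and message fields are hypothetical, and each "router" is modeled as a plain function rather than a separate application.

```python
def make_router(rule, outbound_subject):
    """Build a single-rule router: publish if the rule matches, then pass the message on."""
    def router(message, published):
        if rule(message):
            published.append((outbound_subject, message))
        return message  # the unchanged message continues to the next router
    return router


# Two hypothetical rules, one per router, arranged in sequence.
CHAIN = [
    make_router(lambda m: m["qty"] >= 1000, "trades.large"),
    make_router(lambda m: m["symbol"] == "IBM", "trades.ibm"),
]


def run_chain(message):
    """Pass one inbound message through every router in the chain."""
    published = []
    for router in CHAIN:
        message = router(message, published)
    return published


both = run_chain({"symbol": "IBM", "qty": 5000})
one = run_chain({"symbol": "IBM", "qty": 10})
```

Every additional rule requires another router in the chain, which is the source of the network complexity the text attributes to this design.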

Some solutions to the general problems posed by the complexities of enterprise application integration have been proposed by various U.S. patents. For example, U.S. Pat. No. 6,256,676 to Taylor et al. relates to a system for integrating a plurality of computer applications, including an adapter configured for each of the applications, the adapter controlling the communication to and from the associated application. The system of Taylor et al. permits communication across a variety of different messaging modes, including point-to-point, publish-subscribe, and request-reply messaging, utilizing message definitions for each type of object to be passed through the system. A number of different types of adapters are required for each application, and for each message definition. While the architecture of this system permits flexibility in system construction, it requires a significant amount of work by the user to properly construct the system. This system adapts to the applications to be connected, rather than requiring the applications to adapt themselves to the system.

U.S. Pat. No. 5,680,551 to Martino, II describes a system for connecting distributed applications across a variety of computing platforms and transport facilities. To implement this system, it is necessary to modify each of the applications to be connected to include the basic operating core (i.e., the application programming interface) of the system. This system does not support a publish-subscribe messaging platform, and any application desiring to receive messages must actively seek out new messages. In order to use this system, a messaging user interface to each application is designed, then the messaging system is integrated into each application to be connected, and finally the system is configured and tested. Following these steps for each application to be connected is both labor-intensive and time-intensive.

In regard to content processing and routing, U.S. Pat. No. 6,216,173 to Jones et al. discloses a method and apparatus for incorporating such intelligence into networks. The system of Jones et al. associates attributes with each service request which allows the system to obtain knowledge about the content and requirements of the request. Using this knowledge, along with knowledge of the available services, the system can route the request to a suitable service for processing. This system also permits communication across disparate networks, by converting the data for transmission across each type of network. The conversion process occurs while the data is being sent from, for example, Node A to Node C. An intermediate stop is made at Node B to convert the data from the format at Node A to the format at Node C. The data conversion occurs during the routing process, not once routing is completed.

While these patents address various problems existing in the prior art, none contemplate use of a single application to handle all of the routing, allowing the applications at either end of a publish-subscribe or a point-to-point messaging system to run as-is without modification, and to run in any messaging environment regardless of the specifics of the messaging platform (i.e., to be messaging system agnostic).

SUMMARY OF THE INVENTION

The present invention provides an efficient routing system and method that runs in any publish-subscribe or point-to-point messaging environment regardless of the specifics of the messaging platform and that allows applications at either end of the routing system to run as-is without modification. The system functions in a multithreaded environment and is capable of handling complex routing rules and message transformation. It is also capable of learning and executing new routing rules and message transformations that may be required by new users of the system whose message consumption requirements may be in formats previously unrecognized by the system. The system enables precise and reliable logging of messages throughout processing and supports publication of enterprise-wide broadcast messages. The system further preferably employs cooperating inbound and outbound transport processes for consuming, routing, processing, safely storing and publishing messages in batches of logical units of work to ensure that the logical units of work are not lost in system transactions. The system also preferably utilizes a replay server for preserving and replaying messages that might otherwise fail to reach their intended destinations because of router or application error or failure.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the present invention, reference is made to the following detailed description of an exemplary embodiment considered in conjunction with the accompanying drawings, in which:

FIG. 1 is a diagram showing a prior art data transfer system operating under File Transfer Protocol;

FIG. 2 is a diagram showing a prior art data transfer system operating as a single-threaded application on a messaging platform;

FIG. 3 is a flow diagram of a prior art router, showing how the router processes a message;

FIG. 4 is a block diagram showing how the prior art applied multiple routing rules to a single message;

FIG. 5 is a flow diagram of a routing system constructed in accordance with the present invention;

FIG. 6A is a diagram of a first embodiment of a message replay scheme of the routing system according to the present invention;

FIG. 6B is a diagram of a further embodiment of a message replay scheme of the routing system according to the present invention;

FIG. 7 is a diagram of a first embodiment of a message transaction management scheme of the routing system according to the present invention;

FIG. 8 is a diagram of a further embodiment of a message transaction management scheme of the routing system according to the present invention;

FIG. 9 is a diagram of a first portion of the further embodiment of a message transaction management scheme of FIG. 8, in particular, a preferred multithreaded process for each inbound transport capable of running a consuming thread for each inbound topic/queue;

FIG. 10 is a diagram of a second portion of the further embodiment of a message transaction management scheme of FIG. 8, in particular, a preferred multithreaded process for each outbound transport capable of running a publishing thread for each source topic/queue;

FIG. 11 is a simplified schematic diagram depicting the manner by which the routing system according to the present invention achieves fully scalable multithreaded, multi-topic message consumption, processing and publication; and

FIG. 12 is an overview of the message routing and transformation functions of the routing system according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring now to FIG. 5, the routing system of the present invention comprises a router 500 that accepts or consumes an inbound message 502, processes the inbound message 502 and outputs one or more outbound messages 504. The router 500 examines the contents of the inbound message 502, which is published to a general subject, and from those contents determines the specific subject(s) on which the outbound message(s) 504 should be published for consumption by the ultimate recipient(s) of the outbound message(s) 504. Although described herein as it might be used in connection with a publish-subscribe messaging environment, the routing system and method of the present invention also finds beneficial application in a point-to-point messaging environment.

Multithreaded Execution

The router 500 preferably operates in a multithreaded environment. For a router to be able to operate as a multithreaded application, the underlying messaging platform must also be multithreaded. In the prior art, as discussed above in connection with FIG. 3, the messaging platform was operating on only a single thread of execution. In such circumstances, in order to achieve a higher throughput of messages, it was necessary to instantiate a plurality of routers, each running as a separate application, i.e., threading by instance. However, as the number of instances of the router application concurrently executing increases, the overhead associated with managing all of those instances becomes complicated, and ultimately, the performance of the overall system will suffer due to the excessive overhead.

It would be preferable to thread the router in a multithreaded architecture, whereby multiple threads would be operating in the same process space, lowering the overhead required to manage multiple concurrently executing threads. The messaging platform on which the present invention executes should be multithreaded, and at least the client library of the messaging platform should be multithread-safe. Having a multithreaded architecture does not, however, mean that the system cannot also be threaded by instance to increase the overall throughput.

The router 500 may operate, for example, on an ETX 3.2 or other ETX messaging platform from Tibco Software. However, at this juncture it should be made clear that while the present invention is described in connection with an ETX messaging platform it may also find beneficial use with other multithreaded messaging platforms as well, including, without limitation, the IBM MQ Series messaging platform. Indeed, as will be described in greater detail later herein, the present system is capable of accommodating messages that are published and consumed by disparate messaging platforms.

Continuing, when the client library of a messaging platform (the actual portion that communicates with a broker/node) reaches maximum throughput capacity of approximately ten threads, the performance of the router eventually begins to slow down due to the thread management overhead. When such a condition is reached, it may be necessary to create another instance of the router 500 in order to handle the message traffic. Once the new instance of the router 500 is created, the message traffic can be distributed between the multiple instances of router 500 to maximize the throughput of all of the instances presently running.

The maximum throughput of an ETX node is approximately 200 messages per second (again, assuming an average message size of two kilobytes). When that threshold is reached, it would be necessary to have more than one node/broker running. On the other hand, if maximum throughput of a routing instance has been reached, e.g., multiple nodes operating at or near capacity on a single routing instance, it would be necessary to instantiate additional instances of the router. In this manner, layers of transport brokers/nodes and routing instances can be added to reach a desired performance quota, which is then only limited by physical limitations such as machine, hardware, or network bottlenecks that cannot be circumvented without buying new equipment. In a preferred embodiment, the desired throughput for the system is approximately 150 messages per second (again, assuming an average message size of two kilobytes), which should sufficiently perform on one ETX node.
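As a rough worked example of the capacity reasoning above, the following sketch computes how many nodes a given message rate would require. The 200 messages-per-second node capacity and the two-kilobyte average message size are taken from the text; the function names and the 450 messages-per-second load are hypothetical, and the calculation ignores the physical bottlenecks the text mentions.

```python
import math

NODE_CAPACITY = 200   # approximate maximum messages per second for one node (from the text)
AVG_MESSAGE_KB = 2    # assumed average message size in kilobytes (from the text)


def nodes_required(target_rate, node_capacity=NODE_CAPACITY):
    """Smallest number of nodes whose combined capacity covers target_rate."""
    return math.ceil(target_rate / node_capacity)


def payload_kb_per_sec(rate, avg_kb=AVG_MESSAGE_KB):
    """Approximate payload volume in kilobytes per second at the given message rate."""
    return rate * avg_kb


preferred = nodes_required(150)   # the preferred embodiment's target rate
heavier = nodes_required(450)     # a hypothetical heavier load
```

Under these assumptions the preferred 150 messages per second fits on a single node, carrying roughly 300 kilobytes of payload per second, while the hypothetical 450 messages per second would call for three nodes.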

An additional problem encountered when dealing with a singly-threaded router is that each instance of that router operates in the same manner. By definition, this is what would occur if multiple instances of the same application were used; each instance would be expected to operate in the same manner. The key issue with that, apart from the fact that there are several different application processes to manage, is that all of the processes are essentially performing the same operations. Each process is potentially caching the same routing data and each process is, again by definition, applying the same business logic for routing messages. This becomes problematic when the user wants to change an aspect of the routing, because there are several processes that need to be changed in order to do so.

The real difficulty arises in coordinating those changes across all of the different processes, because all of the processes need to be in a consistent state at all times to avoid an error condition. In other words, if a message is in the middle of being processed and the router that is performing the processing is updated, a routing error may occur. Because multiple applications may be involved and/or dependent upon a single message being processed in a particular way, it is necessary to ensure that all of the applications relying on that message operate in a consistent manner. Attempting to coordinate several disparate applications can be difficult on its own because there needs to be some sort of management protocol involved in the communication between the applications. Even though each different process space is executing the same application, there is nothing that binds those process spaces together.

By utilizing a multithreaded architecture, the method of making changes to the system is simplified by having only one location where the changes need to be made, and those changes can be propagated to the other threads of execution. Furthermore, the overall system architecture is neater in the context of managing multiple instances of the same routing logic, and perhaps more importantly, not having to manage multiple instances of the routing data. For example, if there is a large cache associated with the routing logic in each instance of the router, the cache would need to be instantiated the same number of times as there are routers, because each router would be operating in a separate process space. However, if the router were multithreaded, the cache would only need to be instantiated once for each router, thereby minimizing the overhead associated with managing multiple instances of the cache.
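The benefit of a single cache shared by several threads in one process space can be sketched as follows. This is an illustrative sketch only, not the router's actual implementation: the `RoutingCache` class, its lock-based synchronization, and the key and tag values are assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class RoutingCache:
    """One routing cache per process, shared by all worker threads (hypothetical)."""

    def __init__(self, keymap):
        self._lock = threading.Lock()
        self._keymap = dict(keymap)

    def lookup(self, key):
        """Return the routing tag for a key, or None if the key is unknown."""
        with self._lock:
            return self._keymap.get(key)

    def update(self, key, tag):
        """Change a mapping in one place; every thread sees it on its next lookup."""
        with self._lock:
            self._keymap[key] = tag


cache = RoutingCache({"A100": "TAG_NY"})  # instantiated once, not once per thread

with ThreadPoolExecutor(max_workers=4) as pool:
    before = list(pool.map(cache.lookup, ["A100"] * 4))

cache.update("A100", "TAG_LDN")           # a single update reaches all threads

with ThreadPoolExecutor(max_workers=4) as pool:
    after = list(pool.map(cache.lookup, ["A100"] * 4))
```

With threading by instance, the same change would have to be coordinated across as many separate caches as there are router processes.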

Referring back to FIG. 5, the inbound message 502 is first examined at block 510, where an introspection module or key extraction routine is called. The particular introspection module to be called is dependent upon the source of the inbound message 502 and is retrieved from an introspection module library 512 and dynamically loaded based upon the routing configuration of a particular routing instance. As mentioned previously, an introspection module is a customized routine that contains the logic for handling a specific type of message. This code will read a message and extract the information needed to determine how to route the inbound message 502 to the proper specific subject, namely a routing key. When the router 500 is applying multiple routing rules to a single inbound message 502, different key extraction routines might be invoked multiple times in sequence. The implementation of how the router 500 handles multiple routing rules will be discussed in greater detail below.

At block 520, a routing key is extracted from the inbound message 502, and the value of the routing key is evaluated. This value is matched against a keymap table 522 to determine a routing tag for the inbound message 502. The keymap table 522 is a two column table that lists the values of the routing key in one column and the matching routing tags for those values in another column. The data in the keymap table 522 is cached locally within the router 500 for rapid access to the data. When the introspection module is loaded from the introspection module library 512, the keymap table 522 is loaded into the memory of the router 500 from an external routing information database 524.

Once a routing tag for the inbound message 502 has been identified, at block 530, the routing tag is evaluated at block 540 to determine whether the routing tag is bound to a publication/outbound subject, another rule or both. If the tag is bound to a subject, then control is passed to block 550, where the routing tag is used to access an outbound routing table 552 to identify the outbound subject for the inbound message 502. The outbound routing table 552 is a two column table that lists the values of the routing tag in one column and the outbound subjects for those values in another column. As with the keymap table 522, the outbound routing table 552 is cached in local memory when the introspection module is loaded from the introspection module library 512 by loading the outbound routing table 552 from the routing information database 524. Once the outbound subject has been retrieved at block 550, the inbound message 502 is published to the new subject as an outbound message 504.

If the routing tag evaluated at block 540 is not a subject, it must be another routing rule to be applied to the inbound message 502. Control is then passed back to block 520, where the inbound message is evaluated against the next rule in a similar manner as previously described. It is through this type of evaluation mechanism that multiple routing rules can be applied to a single inbound message 502, and thereby produce one or more outbound messages 504. The process from block 520 through block 540 is repeated for each routing rule that is contained in the introspection module. The router 500 is designed to be flexible, in that an end user of the router 500 has great latitude in configuring how the routing rules operate and how they are applied. Cascading routing of this sort overcomes the problem of the prior art, which would have required the use of multiple routers to apply multiple rules to a single message.
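The cascading evaluation of blocks 520 through 550, in which a routing tag may be bound to an outbound subject, to another rule, or to both, can be sketched with a small work queue. The rule names, tags, subjects, and data layout below are hypothetical and are not taken from the routing information database; the sketch also assumes the configured rules do not refer back to one another in a cycle.

```python
# Hypothetical rules: rule name -> (key extraction routine, keymap of key value -> routing tag).
RULES = {
    "by_make": (lambda m: m["make"], {"ford": "TAG_FORD", "dodge": "TAG_DODGE"}),
    "by_style": (lambda m: m["style"], {"pickup": "TAG_FORD_PICKUP"}),
}

# Hypothetical bindings: a routing tag may name a subject, a further rule, or both.
TAG_BINDINGS = {
    "TAG_FORD": {"subject": "autos.ford", "rule": "by_style"},
    "TAG_DODGE": {"subject": "autos.dodge"},
    "TAG_FORD_PICKUP": {"subject": "autos.ford.pickup"},
}


def route(message, first_rule):
    """Return every outbound subject produced by cascading rule evaluation."""
    subjects = []
    pending = [first_rule]
    while pending:
        extract, keymap = RULES[pending.pop(0)]
        tag = keymap.get(extract(message))    # block 520: routing key -> routing tag
        if tag is None:
            continue                          # no mapping for this key under this rule
        binding = TAG_BINDINGS[tag]           # block 540: what is the tag bound to?
        if "subject" in binding:
            subjects.append(binding["subject"])   # block 550: publish to this subject
        if "rule" in binding:
            pending.append(binding["rule"])       # back to block 520 with the next rule
    return subjects


ford_pickup = route({"make": "ford", "style": "pickup"}, "by_make")
ford_gt = route({"make": "ford", "style": "gt"}, "by_make")
```

Here a single router produces two outbound subjects for the Ford pickup message, where the prior art arrangement would have needed one router per rule.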

It is possible to build additional functionality into the router 500 that would permit the router 500 to automatically extract the necessary routing keys from the inbound message 502. For instance, an inbound message 502 could be in a pre-defined format supported by router 500. Thus, an introspection module for that pre-defined format would not be necessary, since the router 500 would have the logic built-in to be able to parse that type of inbound message 502. In these circumstances, a publisher of a message in the pre-defined format would need to provide the routing tags used within the message format to represent the key values for that publisher's messages.

The router of the present invention assumes that the system designer has architected the enterprise network in such a way as to make the best use of the router and the system bandwidth. While the router has sufficient intelligence to route messages to various destinations, it cannot determine if there is a more efficient method of doing so. The router is reinforcing an underlying premise in the content-based routing arena, which is that a publisher does not send any information that is not required to any one consumer. So a publisher wants to be completely abstracted from who the consumers are, but a consumer does not want to have to throw away messages that it is not interested in.

The consumer only wants to receive messages that are of interest to it, without having to worry about any other messages. By definition, this means that when a message is published to a particular subject, that message is of complete interest to a consumer of that subject. Therefore, it is incumbent upon the system architect to properly design the system to make the most efficient use of the available bandwidth. The router is completely agnostic to the architecture, in that it will function in the same manner regardless of the system it is utilized in.

From a general perspective, it is desirable to place the message routing as close to the publisher and as far from the consumer as possible. In such circumstances, message introspection becomes important, because a message can be initially published to a general subject, and then after the introspection occurs, can be published to the specific subject desired by a consumer. The driving concept behind placing the routing logic close to the publisher is to dispatch the message to its final destination as quickly as possible, thereby maximizing the efficiency of the overall network. The fewer times a single message is published to somewhere that is not its final destination, the less network traffic there is, and therefore, the network becomes more efficient.

Routing Example

The following example illustrates how the router of the present invention handles complex routing rules. In this example, the consuming topic is called US_AUTOMOBILES, and all messages in this topic are formatted using Extensible Markup Language (XML). The content of each message describes different makes, models, and characteristics of some common U.S.-produced automobiles and light trucks. The content of the messages shown in Table 1 below is provided to show the flexibility of the router of the present invention, and in no way reflects the actual attributes of any vehicle produced.

TABLE 1 Sample Messages.

Inbound Message Sequence    Message Content
1                           <msgClass>cars<make>chevrolet<style>sportUtility<model>blazer<color>blue<driveTrain>4wd<engine>V6 . . .
2                           <msgClass>cars<make>chevrolet<style>sportUtility<model>blazer<color>red<driveTrain>2wd<engine>V6 . . .
3                           <msgClass>cars<make>dodge<style>sportUtility<model>durango<color>red<driveTrain>4wd<engine>V8 . . .
4                           <msgClass>cars<make>dodge<style>sportUtility<model>durango<color>green<driveTrain>2wd<engine>V6 . . .
5                           <msgClass>cars<make>ford<style>sportUtility<model>explorer<color>blue<driveTrain>2wd<engine>V6 . . .
6                           <msgClass>cars<make>ford<style>sportUtility<model>explorer<color>green<driveTrain>4wd<engine>V8 . . .
7                           <msgClass>cars<make>ford<style>pickup<model>f250<color>red<driveTrain>4wd<engine>V8 . . .
8                           <msgClass>cars<make>dodge<style>roadster<model>viper<color>blue<driveTrain>2wd<engine>V10 . . .
9                           <msgClass>cars<make>chevrolet<style>gt<model>z28<color>white<driveTrain>2wd<engine>V8 . . .
10                          <msgClass>cars<make>chevrolet<style>pickup<model>1500<color>silver<driveTrain>4wd<engine>V6 . . .
11                          <msgClass>cars<make>chevrolet<style>roadster<model>corvette<color>green<driveTrain>2wd<engine>V8 . . .

Table 2 below depicts the various routing scenarios in this example that are to be applied to the messages shown above in Table 1.

TABLE 2 Routing Scenarios.

Number    Scenario
1         Destination A wants V8 powered vehicles
2         Destination B wants pickups with 4wd
3         Destination C wants gt cars and roadsters
4         Destination D wants green sport utility vehicles with 4wd and V8 engines
5         Destination E wants red vehicles with 2wd

Based upon the routing scenarios shown in Table 2, the following table shows the routing rules that exist in the router to be able to satisfy each scenario.

TABLE 3 Routing Rules.

Destination    Engine tag    Style tag         DriveTrain tag    Color tag
A              V8            -                 -                 -
B              -             pickup            4wd               -
C              -             gt OR roadster    -                 -
D              V8            Sport Utility     4wd               green
E              -             -                 2wd               red

When applying each of the rules, all of the conditions specified by the rule must be satisfied in order for a message to be sent to a particular destination. This is an example of nested routing. Applying these rules to the inbound messages shown in Table 1 leads to the following results.

TABLE 4 Routing Results.

Destination    Messages Received
A              3, 6, 7, 9, 11
B              7, 10
C              8, 9, 11
D              6
E              2

When each rule shown in Table 3 is applied to a message in Table 1, the message is evaluated on a tag-by-tag basis to determine if there is a match. When the rules are nested (as they are for all destinations except Destination A), all of the conditions specified by the rule must be met in order for a message to be published to the destination. As shown in Table 4, it is possible for the same message to be published to multiple destinations (i.e., Messages 6, 7, 9, and 11) and it is also possible that some messages may not be published at all (i.e., Messages 1, 4, and 5).
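
By way of illustration only, the tag-by-tag evaluation described above can be sketched in Python. The dictionary-based rule format, the field tuple, and the function names matches and route are assumptions made for this sketch and are not part of the router described herein; the Style value for Destination D is written as sportUtility so that it matches the tag value used in the Table 1 messages.

```python
# Illustrative sketch only: field names mirror the XML tags of Table 1, but the
# dict-based rule format is an assumption, not the router's own representation.
FIELDS = ("make", "style", "model", "color", "driveTrain", "engine")

RAW_MESSAGES = {
    1: ("chevrolet", "sportUtility", "blazer", "blue", "4wd", "V6"),
    2: ("chevrolet", "sportUtility", "blazer", "red", "2wd", "V6"),
    3: ("dodge", "sportUtility", "durango", "red", "4wd", "V8"),
    4: ("dodge", "sportUtility", "durango", "green", "2wd", "V6"),
    5: ("ford", "sportUtility", "explorer", "blue", "2wd", "V6"),
    6: ("ford", "sportUtility", "explorer", "green", "4wd", "V8"),
    7: ("ford", "pickup", "f250", "red", "4wd", "V8"),
    8: ("dodge", "roadster", "viper", "blue", "2wd", "V10"),
    9: ("chevrolet", "gt", "z28", "white", "2wd", "V8"),
    10: ("chevrolet", "pickup", "1500", "silver", "4wd", "V6"),
    11: ("chevrolet", "roadster", "corvette", "green", "2wd", "V8"),
}
MESSAGES = {seq: dict(zip(FIELDS, values)) for seq, values in RAW_MESSAGES.items()}

# Table 3 as data: every tag listed for a destination must match (AND);
# a tag with several accepted values matches any one of them (OR).
RULES = {
    "A": {"engine": {"V8"}},
    "B": {"style": {"pickup"}, "driveTrain": {"4wd"}},
    "C": {"style": {"gt", "roadster"}},
    "D": {"engine": {"V8"}, "style": {"sportUtility"},
          "driveTrain": {"4wd"}, "color": {"green"}},
    "E": {"driveTrain": {"2wd"}, "color": {"red"}},
}


def matches(message, rule):
    """Return True only if every tag named by the rule holds an accepted value."""
    return all(message.get(tag) in accepted for tag, accepted in rule.items())


def route(messages, rules):
    """Map each destination to the sequence numbers of the messages it receives."""
    results = {dest: [] for dest in rules}
    for seq in sorted(messages):
        for dest, rule in rules.items():
            if matches(messages[seq], rule):
                results[dest].append(seq)
    return results


results = route(MESSAGES, RULES)
routed = {seq for seqs in results.values() for seq in seqs}
unrouted = sorted(set(MESSAGES) - routed)
```

Under these assumptions, results reproduces the destination lists of Table 4, and unrouted holds the messages that match no rule.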

Message Replay

Large national and international businesses may publish and consume millions of electronic messages per day. In many businesses (such as, for example, brokerages involved in electronic financial and equities transactions), it is imperative that the transactions be processed on a first-in, first-out (FIFO) basis. According to a preferred embodiment, the routing system according to the present invention can provide such FIFO transaction processing. As reflected in FIGS. 6A and 6B, this can be done in two ways.

FIGS. 6A and 6B show overviews of preferred embodiments of message replay procedures that may be executed by the routing system according to the invention. As seen in each of those figures, at least one publisher 600 publishes “primary topic” messages 602 to a router 610. The router 610 processes the messages 602 and publishes the messages to a first topic (Topic 1), a second topic (Topic 2), up to an Nth topic (Topic N), the total number of topics being flexible, as in any messaging system. The topics are subscribed to by a first consumer 620, a second consumer 622, up to an Nth consumer 624, with the total number of consumers also being flexible. It will be understood that there may not necessarily be a one-to-one correspondence between topics and consumers, although it is illustrated herein as such for simplicity of illustration and description. As used herein, the terms “consumer(s)” and “subscriber(s)” are interchangeable and refer to the destinations to which outbound messages are published by the routing system of the present invention.

The system illustrated in FIGS. 6A and 6B additionally includes a replay server 630. The replay server is a “super consumer” that acts as a source of data capture. It receives and stores all “primary topic” messages on Topic 1, Topic 2, . . . , Topic N that are published by the router 610 and it may be prompted from time to time to replay certain ones of those messages. Thus, if something happens downstream between the router 610 and a consumer 620, 622 and/or 624 that causes message delivery problems (for example, if the routing logic is flawed or if another application drops messages), the system according to the invention enables the lost messages to be recovered and redelivered to their proper destinations such that the recovered or “recovery topic” messages can be processed in FIFO fashion by their intended consumers. As depicted in FIGS. 6A and 6B, the recovery topic messages are preferably encoded by either the router 610 or the replay server 630 in such a way that interested consumer(s) recognize them as recovery topic messages rather than as original publications of primary topic messages. This encoding is reflected in FIGS. 6A and 6B by the addition of a prime symbol (′) to the primary topics Topic 1, Topic 2, . . . , Topic N, i.e., recovery topic messages comprise the messages on Topic 1′, Topic 2′, . . . , Topic N′.

It is important to note that in addition to allowing a user of the system to have messages re-published to it, the replay server 630 strips certain metadata tags, defined by the user, from the messages. This metadata is stored in the replay database as columnar data along with an image column that represents the message. This allows users to make so-called “smart” queries against a replay graphical user interface (“GUI”) to determine what part (subset) of a message flow they want to be re-sent.
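
As a hedged illustration of storing metadata as columnar data alongside a message image, the following Python sketch uses an in-memory SQLite database. The table name replay_store, the metadata columns desk and symbol, and the helper functions are hypothetical; the actual replay database schema and the queries issued by the replay GUI are not specified herein.

```python
import sqlite3


def build_replay_store():
    """Create an in-memory table holding metadata columns plus a message image."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE replay_store (
            seq    INTEGER PRIMARY KEY,  -- preserves publication order
            topic  TEXT NOT NULL,
            desk   TEXT,                 -- hypothetical user-defined metadata tag
            symbol TEXT,                 -- hypothetical user-defined metadata tag
            image  BLOB NOT NULL         -- the message as published
        )
        """
    )
    return conn


def capture(conn, seq, topic, desk, symbol, image):
    """Store one captured message: metadata in columns, the message as an image."""
    conn.execute(
        "INSERT INTO replay_store VALUES (?, ?, ?, ?, ?)",
        (seq, topic, desk, symbol, image),
    )


def select_for_replay(conn, topic, symbol):
    """Return (seq, image) pairs for the requested subset, oldest first."""
    cursor = conn.execute(
        "SELECT seq, image FROM replay_store "
        "WHERE topic = ? AND symbol = ? ORDER BY seq",
        (topic, symbol),
    )
    return cursor.fetchall()


conn = build_replay_store()
capture(conn, 1, "Topic 1", "NY", "IBM", b"<trade id='1'/>")
capture(conn, 2, "Topic 1", "NY", "GE", b"<trade id='2'/>")
capture(conn, 3, "Topic 1", "LDN", "IBM", b"<trade id='3'/>")
capture(conn, 4, "Topic 2", "NY", "IBM", b"<trade id='4'/>")

subset = select_for_replay(conn, "Topic 1", "IBM")
```

Because the metadata lives in ordinary columns, a subset of a message flow can be selected without parsing the stored images, and ordering by sequence number keeps the selected messages in publication order.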

A first message recovery scenario is shown in FIG. 6A and may be generally referred to as “router recovery.” As described below, router recovery might be deployed on a large scale to recover large amounts of data that might be lost because of harm to the communications infrastructure of a business unit of a distributed enterprise. Alternatively, when a consumer is an end user application, router recovery might also be used to recover on all topics subscribed to by that application. As depicted in FIG. 6A, when router recovery is desired, a consumer 620, 622 and/or 624 sends a replay request 640 to router 610. Once that request is made, the replay server 630 picks up the user request and the data from the replay data store and republishes the data on the desired recovery topic through the router 610. In order to consume the desired messages republished through router 610, the user switches off consumption on the topic from the router (i.e., switches off consumption of primary topic messages) while switching on consumption on the topic being published by the replay server through the router (i.e., switches on consumption of recovery topic messages) until the queue of desired messages is drained from the replay server. After having consumed the desired recovery topic messages from the replay server 630 through router 610, the consumer switches back to router consumption on the primary topic and consumes from the router as it did prior to the recovery request. It will be understood that while a consumer is requesting and consuming recovery topic messages, the router otherwise continues to process primary topic messages in the order that they were published by a publisher or publishers 600. This methodology allows a preservation of FIFO ordering.

As far as router 610 is concerned, replay is simply an injection point. That is, the router can publish to multiple targets. From the router's perspective, replay is simply another target (although replay has a dedicated adapter in the routing infrastructure that allows direct Java database connectivity (“JDBC”) injection of message images and metadata so that the two are very tightly linked). Simply stated, the user requests re-transmission, either full or partial, based on the replay GUI while the router facilitates the replay data injection.

A second message recovery scenario is shown in FIG. 6B and may be generally referred to as “replay server recovery.” In replay server recovery, an application instance of a consumer 620, 622 and/or 624 submits a replay request 640 directly to the replay server 630 requesting messages on a recovery topic. The requesting consumer application instance is then switched to listen for messages from the replay server 630 on the desired recovery topic Topic 1′, Topic 2′, . . . , Topic N′. During this time the requesting consumer(s) do not consume primary topic messages on primary topics Topic 1, Topic 2, . . . , Topic N published by the router 610. When the requesting consumer(s) have consumed the recovery topic messages requested from the replay server 630, the application instance of the requesting consumer(s) is switched to primary topic mode whereby it again listens for messages published by the router 610 on the desired primary topic. Replay server recovery consumes fewer system resources than router recovery since it does not involve the router in the recovery process. For this reason, replay server recovery is a preferred message recovery method in instances where fine-grained message recovery is sought, i.e., recovery of a relatively limited scope or range of messages.
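
The switch between primary topic mode and recovery topic mode can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class ReplayAwareConsumer and its methods are hypothetical, the two deques stand in for a primary topic and its recovery topic, and it is assumed (not stated herein) that the messaging bus retains primary topic messages while the consumer has switched off consumption on that topic.

```python
from collections import deque


class ReplayAwareConsumer:
    """Hypothetical consumer that listens on one topic at a time."""

    def __init__(self):
        self.mode = "primary"
        self.processed = []

    def request_replay(self):
        # Submitting a replay request switches the consumer to recovery mode.
        self.mode = "recovery"

    def poll(self, primary_topic, recovery_topic):
        if self.mode == "recovery":
            # Consume only recovery topic messages until the queue is drained,
            # then switch back to primary topic mode.
            while recovery_topic:
                self.processed.append(recovery_topic.popleft())
            self.mode = "primary"
            return
        while primary_topic:
            self.processed.append(primary_topic.popleft())


primary = deque(["m1"])
recovery = deque()
consumer = ReplayAwareConsumer()
consumer.poll(primary, recovery)      # normal consumption of m1

consumer.request_replay()             # m2 and m3 were lost downstream
recovery.extend(["m2'", "m3'"])       # replayed on the recovery topic
primary.extend(["m4", "m5"])          # the router keeps publishing meanwhile

consumer.poll(primary, recovery)      # drains the recovery topic only
consumer.poll(primary, recovery)      # back on the primary topic
order = consumer.processed
```

In this sketch the replayed messages are processed before the primary topic messages that arrived during recovery, which is the FIFO property the recovery procedure is intended to preserve.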

In addition to assuring FIFO transaction processing, the replay server according to the present invention offers other significant benefits to distributed businesses that have facilities in more than one location. For such businesses, the system according to the invention may be advantageously employed in a peer model wherein the peers of the enterprise are connected by a wide area network (WAN) and wherein each peer is symmetrically equipped with a router 610 and a replay server 630.

Consider, for instance, a brokerage house having a New York peer which primarily brokers transactions on North American stock exchanges, a London peer which primarily brokers transactions on European stock exchanges and a Tokyo peer which primarily brokers transactions on Asian stock exchanges. With the present routing system, there is no need for a centralized router through which all of the messages of the enterprise would have to be routed before being published to their intended consumers. Under normal operating conditions, the general data router of the New York peer would primarily handle the business transactions conducted by the North American business units, the general data router of the London peer would primarily handle the business transactions conducted by the European business units, and the general data router of the Tokyo peer would primarily handle the business transactions conducted by the Asian business units. In this way, WAN message traffic is significantly reduced and transactions are settled more quickly than they would be if they all had to be first routed through a centralized router.

Additionally, in the peer model herein described, no single router would represent a potential global point of system failure. In this regard, consider a situation where a division, plant, office or other business unit of a distributed enterprise suffers debilitating harm by an act of God, an act of terrorism or war, or other catastrophe. In that event, the replay server of the peer which includes the damaged business unit preserves messages published by the damaged business unit prior to occurrence of the damage. Those messages can be replayed by the replay server to the general data routers of other peers in the network. Thus, the pre-damage transactions may be successfully processed by the other peer(s) in the network. With a messaging system architected as such, the integrity of all messages published by the damaged business unit prior to the occurrence of the damage can be retained and processed by the system.

Broadcast Messages

Any general data router of the routing system of the present invention may publish a broadcast message from any publisher who publishes messages to that router. A broadcast message may be any message that may be of interest to one or more units or one or more peers of a distributed enterprise or even the entire enterprise itself. A broadcast message may be merely informational in nature or it may, as discussed below, serve as an automatic trigger event that causes some other event(s) to be undertaken by the recipients of the broadcast message. In any case, the router applies a business rule to the broadcast message which identifies the message as a broadcast message whereby the broadcast message is published to all registered listeners on the system.

When a general data router in the routing system according to the present invention is used in a worldwide securities trading environment, for example, that router may be processing trading data twenty-four hours a day, seven days a week. In order to properly process messages throughout the system, there needs to be some logical separator that signifies when the end of a business day has been reached. This type of message is called an “end of day” (“EOD”) message and is treated as an enterprise-wide event. For example, in the aforementioned peer model of a brokerage house having peers in New York, London and Tokyo, EOD messages are sent daily from those peers indicating the ends of business days in New York, London and Tokyo, respectively. These EOD events are of interest to every potential consumer connected to the system (i.e., all subscribers on all subjects). The router of the present invention does not route an EOD message like any other message, e.g., to a particular business unit. Instead, the router broadcasts the EOD message to every pre-registered consumer that the router can publish to.

An EOD message is sent by a publisher signifying that any non-EOD message, e.g., a trade-related message, received by a consumer after the EOD message should be processed on the next business day. This does not mean that the processing of non-EOD messages is delayed until the next calendar day; rather, the EOD message serves as a logical separator between business days. In that way, the EOD message signifies to its recipients to begin various batch processes or other end of day summaries or tasks that need to be performed at the conclusion of a business day. In a worldwide securities trading environment, an EOD message is necessary because if the system is constantly receiving and processing trading messages, there is no other mechanism for the system to be able to determine when the end of a business day has been reached. The EOD message can also be used to shut down certain parts of the system if no further messages will be received by those parts.
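
The distinction between ordinary routing and EOD broadcast can be illustrated with a short Python sketch. The class names BroadcastRouter and DayAwareConsumer, the dictionary message layout, and the integer business-day counter are all assumptions introduced for this illustration; they are not the data structures of the router described herein.

```python
class DayAwareConsumer:
    """Hypothetical consumer that files each message under a business day."""

    def __init__(self, name):
        self.name = name
        self.business_day = 1
        self.by_day = {}  # business day -> payloads received on that day

    def receive(self, message):
        if message["type"] == "EOD":
            # The EOD message is a logical separator between business days.
            self.business_day += 1
        else:
            self.by_day.setdefault(self.business_day, []).append(message["payload"])


class BroadcastRouter:
    """Hypothetical router: subject-based routing plus EOD broadcast."""

    def __init__(self):
        self.subscriptions = {}  # subject -> list of registered consumers

    def register(self, subject, consumer):
        self.subscriptions.setdefault(subject, []).append(consumer)

    def publish(self, message):
        if message["type"] == "EOD":
            # Broadcast: every registered consumer on every subject, once each.
            targets = []
            for consumers in self.subscriptions.values():
                for consumer in consumers:
                    if consumer not in targets:
                        targets.append(consumer)
        else:
            targets = self.subscriptions.get(message["subject"], [])
        for consumer in targets:
            consumer.receive(message)
        return [consumer.name for consumer in targets]


router = BroadcastRouter()
equities = DayAwareConsumer("equities")
bonds = DayAwareConsumer("bonds")
router.register("equities", equities)
router.register("bonds", bonds)

router.publish({"type": "TRADE", "subject": "equities", "payload": "t1"})
eod_targets = router.publish({"type": "EOD"})
router.publish({"type": "TRADE", "subject": "equities", "payload": "t2"})
```

In this sketch the trade received after the EOD message is processed immediately but is attributed to the next business day, consistent with the role of the EOD message as a separator rather than a delay.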

Logging

As a message is being processed, there are different levels of logging that can be used. Basically, a user can configure the amount of logging desired. In other words, as a message comes into the routing software, every time it takes a hop (i.e., comes into the message bus application and gets consumed), it gets handed off from there to the routing logic, and from the routing logic it may be handed into some content transformation module. There is the ability to make the log entries more granular, meaning that each step of the progress of a message can be logged. For example, a log entry could read, “Applying Rule #1. Rule #1 has been evaluated and the result is such and such a routing tag.”

The reason for having different levels of granularity is for use in a debugging scenario. If a user has set up some routing logic and is not getting the expected end result, then there is an error in the routing logic. However, it is fairly difficult to debug a piece of multithreaded application software. It is helpful if the user can read a log that basically shows: “The message came in here and went this way and a decision was made at this point and the message went left, not right,” so the user knows that that is the decision point that he or she needs to change. It is possible that a particular rule did not evaluate the way the user expected, because some key that was returned was not what was expected. However, in a deployed release, the logging level should be set fairly coarse because of the performance overhead from logging a large number of events. In a scenario where a user is testing or is actually in a failure scenario and is trying to determine what went wrong, the logging should be as granular as possible. Therefore, the user should have the ability to configure logging with high or low granularity.

Logging can be handled in two ways: as a function of a unit of work synchronously or as a function of a unit of work asynchronously. In a preferred embodiment, an asynchronous approach is used, wherein the logging messages are sent to a logger program that is responsible for synchronously logging them through to a file which is ultimately readable by a human being.
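
The combination of configurable granularity and asynchronous logging can be sketched with the Python standard library, whose QueueHandler and QueueListener classes hand log records from the working thread to a separate writer thread. This is an analogy only: the logger name, the in-memory stream standing in for the log file, and the use of Python's INFO and DEBUG levels for coarse and granular logging are assumptions of this sketch, not a description of the logger program referred to above.

```python
import io
import logging
import logging.handlers
import queue

# The stream stands in for the human-readable log file.
log_file = io.StringIO()
writer = logging.StreamHandler(log_file)
writer.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

# Records are queued by the working thread and written by the listener thread.
log_queue = queue.Queue()
listener = logging.handlers.QueueListener(log_queue, writer)

logger = logging.getLogger("router.sketch")
logger.addHandler(logging.handlers.QueueHandler(log_queue))
logger.propagate = False

listener.start()

# Coarse granularity, as might be configured for a deployed release.
logger.setLevel(logging.INFO)
logger.debug("Applying Rule #1")            # suppressed at this level
logger.info("message 42 published")

# Fine granularity, as might be configured while debugging routing logic.
logger.setLevel(logging.DEBUG)
logger.debug("Applying Rule #1: result is routing tag V8")

listener.stop()  # waits for the queued records to be written
lines = log_file.getvalue().splitlines()
```

The working thread only pays the cost of enqueueing a record; formatting and writing take place on the listener thread, which is the motivation for the asynchronous approach.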

It is possible to insert user logic between where the logging messages are generated and where they are written to a logging file that would permit the user to map on a certain pattern for a specified type of error message. It is also possible for the logger program to send an e-mail or a lifeline alert which pages someone. It is possible to associate a profile of errors with an associated action or reaction to the logging process to trigger an alert if a serious error comes through. Using a notification system of this type allows errors to be acted on in a timely fashion, instead of attempting to trace through a log file to determine why an error occurred.
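
One way to picture a profile of errors associated with actions is a list of patterns, each paired with the name of a reaction. The following Python sketch is purely illustrative: the patterns, the action names page_on_call and send_email, and the function actions_for are hypothetical and are not taken from the logger program described herein.

```python
import re

# Hypothetical error profile: each pattern is paired with the action it triggers.
ERROR_PROFILE = [
    (re.compile(r"publish failed"), "page_on_call"),
    (re.compile(r"rule \S+ returned unexpected key"), "send_email"),
]


def actions_for(log_line, profile=ERROR_PROFILE):
    """Return the actions whose pattern matches the given logging message."""
    return [action for pattern, action in profile if pattern.search(log_line)]


serious = actions_for("ERROR publish failed on Topic 1")
routine = actions_for("INFO message 42 published")
```

Placing such a lookup between the point where logging messages are generated and the point where they are written lets a serious error trigger its alert as the message passes through, rather than after someone reads the log file.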

Transaction Integration

When working in an EAI environment, it is important to be able to determine whether a transaction has been successfully completed or if the transaction has failed. In the case of a transaction failure, it is often necessary to redo the transaction in order to complete the work involved. Some difficulty arises when dealing with multiple applications, because a transaction needs to be viewed from a system-wide level in order to be considered to be “complete.” In some instances, each application in a system may consider its work to be complete when it finishes its portion of the work and hands the work off to the next application. While this is true, the system as a whole needs to be aware of whether the entire transaction, from start to finish, has been completed.

If there is a transaction failure on a system-wide level (i.e., a failure of a logical unit of work or “LUW”), it is necessary to roll back to the beginning of the transaction so all of the data involved in the transaction can be recovered and the transaction can be restarted. It is irrelevant in the context of an LUW what percentage of the unit of work has failed because it is not possible to recover a percentage of a unit of work. For example, if a message is consumed successfully, but not processed successfully, that message is lost (i.e., it cannot be retrieved from the messaging bus because the messaging bus discarded the message once it was successfully consumed) and cannot be re-evaluated. Being able to recover the lost message is significant, and that is why the control point for the transaction needs to be where the LUW begins. If anything fails between the control point and the commit point for the unit of work (which is guaranteed success of the performance of the unit of work), it is necessary to roll back the entire transaction to the control point so the transaction can be restarted. Placing the control point anywhere other than where the unit of work begins would not permit the unit of work to be restarted in the event of a failure during processing of the unit of work.

In the present invention, an LUW begins when an inbound message is consumed by the router, and ends (commits) when the outbound message is successfully published. Any action taken on the message in between those two points, whether it is routing the message or transforming the message, is part of the LUW. If any of those actions fail, the entire unit of work fails, and the process is restarted from message consumption by the router. By defining the unit of work in this manner, messages will not be lost if a portion of the unit of work fails. From an EAI perspective, this definition is important because it would run counter to the entire EAI paradigm to have components of the enterprise software losing messages by not successfully publishing and consuming them.

However, when interacting with disparate messaging systems, transaction management is difficult to do because each messaging system has its own mechanism for knowing when a transaction has been successfully completed. For example, if an inbound message is coming from an ETX messaging bus, and will be published to an IBM MQ Series messaging bus, it is not possible to take the transaction “begin” from ETX and automatically have the ETX transaction “commit” triggered off of the IBM MQ Series “commit.” As discussed below, the present invention additionally provides a guaranteed message transaction management system wherein a transaction begins when a message is consumed off a messaging bus (e.g., either an ETX or IBM MQ Series bus) and the whole transaction is committed when that message is successfully published to another bus (either an ETX or IBM MQ Series bus).

Referring now to FIG. 7, there is illustrated a simplified guaranteed message transaction management system according to the present invention. As shown in that figure, a router 700 consumes an inbound message 702 at step 710. At this point a “begin” for the transaction relating to the inbound message 702 is created. Work is performed on the message 702 at step 712, and the message is published at step 714 as an outbound message 720. Work may be performed on the message by routing, transformation or both. As the outbound message 720 is published, a message identifier 730, preferably a sequence number, is put into a database 732. Preferably, the outbound messages 720 are temporarily cached and are not published immediately. The messages 720 will be published to the outbound messaging bus in a batch, and the batch size can be determined either by a certain number of messages in the batch or after a certain delay between messages being published.

The LUW will be committed when all of the outbound messages 720 in a batch have been published to the outbound messaging bus, and the database 732 will have the message identifier of the last message published. If, between the time that the “commit” is issued on the outbound messages 720 and the time the “commit” is issued for the inbound messages 702 (and thereby completing the unit of work), there is an error or failure and the inbound messages 702 are not committed, then the entire unit of work rolls back to the first inbound message 702 of the unit of work. In the event of an error or a failure, when the router 700 is restarted, the inbound messages 702 will be consumed a second time, beginning with the first message. When the inbound message 702 is to be published as an outbound message 720, the message identifier 730 of the current message is compared to the list of message identifiers stored in the database 732. If the current message was previously published, as indicated by the same message identifier 730 already existing in the database 732, the reconsumed message is discarded and is not published a second time.
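
The rollback and duplicate-suppression behavior described above can be sketched in Python as follows. The class GuaranteedRouter, its attributes, the upper-casing used as placeholder work, and the simulated failure flag are all assumptions made for illustration; in particular, a Python set stands in for the message identifiers held in database 732, and plain lists stand in for the inbound and outbound messaging buses.

```python
class GuaranteedRouter:
    """Hypothetical sketch of an LUW spanning consume, work and publish."""

    def __init__(self, inbound):
        self.inbound = inbound        # (message_id, payload) pairs on the inbound bus
        self.inbound_committed = 0    # count of inbound messages already committed
        self.published_ids = set()    # stands in for the identifiers in database 732
        self.outbound = []            # stands in for the outbound messaging bus

    def process_batch(self, batch_size, fail_before_inbound_commit=False):
        # Consumption always restarts at the first uncommitted inbound message.
        first = self.inbound_committed
        batch = self.inbound[first:first + batch_size]
        for message_id, payload in batch:
            if message_id in self.published_ids:
                continue  # reconsumed after a rollback: discard, do not republish
            self.outbound.append((message_id, payload.upper()))  # placeholder work
            self.published_ids.add(message_id)
        if fail_before_inbound_commit:
            # Outbound side is committed, inbound side is not: the LUW rolls back.
            raise RuntimeError("failure before inbound commit")
        self.inbound_committed += len(batch)


router = GuaranteedRouter([(1, "a"), (2, "b"), (3, "c")])
try:
    router.process_batch(3, fail_before_inbound_commit=True)
except RuntimeError:
    pass  # the router is restarted and the inbound messages are consumed again

router.process_batch(3)
published = router.outbound
```

After the restart, all three inbound messages are consumed a second time, but each appears on the outbound side only once because its identifier was already recorded.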

Although described as useful for communicating with ETX and IBM messaging buses, the system according to the present invention may accommodate all types of messaging platforms and buses. That is, the client library of a particular messaging platform may provide its own transaction manager or it may use an industry standard known as XA Protocol, which relates to distributed transactions and the coordination of those transactions. In this way the guaranteed message transaction system according to FIGS. 7-10 can successfully execute transactions regardless of the messaging platforms used by the publishers and consumers connected to the system.

FIG. 8 generally illustrates a further embodiment of a message transaction management scheme according to the present invention and FIGS. 9 and 10 provide specific details thereof. The transactional model of FIG. 8 differs from that of FIG. 7 in that the work performed on a message is divided between a consumer process 802 and a publisher process 804 (which processes are described in greater detail in FIGS. 9 and 10, respectively) in such a way as to assure that messages processed by the system are neither lost nor duplicated by either the consumer process or the publisher process when message recovery or replay is required. As shown in FIG. 8, inbound messages are consumed by consumer process 802 from the messaging bus of a dedicated inbound messaging node 800 (e.g., an ETX node). As generally shown in FIG. 8, the messages consumed by the consumer process 802 are worked on by the consumer process and passed to a file system 808 which is in communication with the consumer process 802 and publisher process 804. File system 808 includes a relational database management system (“RDBMS”) 806 which may be an RDBMS from Sybase Inc. or other RDBMS vendor. Through file system 808, persistent message files, referred to herein as “save store files,” are created and write and read offsets are maintained for message batches that are written to the save store files by the consumer process and that are read from the save store files by the publisher process. The details and advantages of such save store files and message batch offsets are set forth below. As shown in FIGS. 9 and 10, save store files are stored in a database 812 of file system 808 that is managed by RDBMS 806. When a batch of messages has been committed to a save store file, the publisher process 804 reads the messages that have been stored on the database pursuant to batch offsets that have been defined by the publisher process and the consumer process. Upon reading of the messages from the appropriate save store file, the publisher process 804 performs certain work on the messages and thereafter publishes those messages to the messaging bus of a dedicated outbound messaging node 810 (e.g., an ETX node) whereby they may be consumed by their intended consumers.

The notion of message batch offsets is graphically depicted in the enlarged “file system” box 808 situated, for clarity of illustration, between the consumer process 802 and the publisher process 804. As instructed by the consumer and publisher processes 802, 804, the file system 808 establishes save store file references including START offsets and END offsets for the save store files committed to the database 812 managed by RDBMS 806. The consumer process 802 establishes the END offset and moves the END offset along until a certain batch of messages has been written to a save store file. The consumer process 802 writes an END offset to the RDBMS 806 after the last message in a batch has been committed to a save store file. Similarly, the publisher process 804 writes a START offset to the RDBMS 806 for each message batch that it reads from a save store file. The publisher process never reads any data before the START offset or after the END offset. Thus, a data “persist” is maintained at all times in the file system 808 whereby everything that is read by the publisher process 804 is transactionally guaranteed by the consumer process 802. It will be understood that a message batch may consist of as few as one message to as many as 1000 or more messages, although a typical batch range according to the present invention is contemplated to be from about 50 to about 100 messages.

As noted above, a routing system occasionally goes down for whatever reason and messages published to the system must be replayed. Without the existence of the START and END offsets shown in FIG. 8, if messages are written by the consumer process 802 to the database 812 and the messaging system is placed into recovery mode, the data placed in the database at the time of recovery would be recognized by the consumer process 802 as being compromised. Accordingly, the consumer process would republish all messages previously written in a batch to the database which would produce duplication of messages previously written by the consumer process to the database. However, if the END offset is properly recorded in the file system 808, then the messages written to the database are transactionally committed by the inbound node 800 and duplicates of those messages will not be resent by the consumer process 802 to the database upon recovery.

Similar to the manner in which the consumer process 802 moves the END offset along before writing the END offset, the publisher process 804 moves the START offset along before writing the START offset. That is, as it reads a batch of messages from a save store file, the publisher process 804 moves the START offset and writes a START offset to the RDBMS 806 for the last message read from the batch. If the START offset is properly recorded in the database, then the publisher process will know where to begin reading messages from the save store file in recovery mode and will not publish duplicate messages.
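
The interplay of the START and END offsets can be pictured with the following minimal Python sketch. The class SaveStore and the functions consumer_write and publisher_read are hypothetical; a list stands in for a save store file, two integers stand in for the offsets persisted in the RDBMS 806, and the truncation of an uncommitted tail on retry is an assumption of this sketch rather than a behavior stated herein.

```python
class SaveStore:
    """Hypothetical stand-in for one save store file and its offsets."""

    def __init__(self):
        self.records = []  # messages written to the save store file
        self.start = 0     # START offset, persisted by the publisher process
        self.end = 0       # END offset, persisted by the consumer process


def consumer_write(store, batch, crash_before_commit=False):
    """Write a batch, then persist the END offset to commit it."""
    # Assumption: anything past the committed END offset is an uncommitted
    # leftover from a failed attempt and is overwritten on retry.
    del store.records[store.end:]
    store.records.extend(batch)
    if crash_before_commit:
        return False  # simulated failure: END offset is never persisted
    store.end = len(store.records)
    return True


def publisher_read(store):
    """Read only the committed, unread window and advance the START offset."""
    batch = store.records[store.start:store.end]
    store.start = store.end
    return batch


store = SaveStore()
consumer_write(store, ["m1", "m2"], crash_before_commit=True)
before_commit = publisher_read(store)   # uncommitted data is never read

consumer_write(store, ["m1", "m2"])     # recovery: the batch is written again
after_commit = publisher_read(store)
again = publisher_read(store)           # nothing left between START and END
```

Because the publisher reads nothing beyond the END offset and nothing before the START offset, the batch that was rewritten during recovery is published once, and a restarted publisher resumes from the last START offset it persisted.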

Referring to FIG. 9, there is shown a detailed schematic of the consumer-side work process performed by a consumer process (such as the consumer process 802 of FIG. 8) in accordance with the further embodiment of the message transaction management scheme of the present invention. The consumer-side work process performs work on messages it consumes from the message bus of a dedicated inbound node 800. Again, for purpose of illustration but not limitation, inbound node 800 is embodied as an ETX node, although it may be a communications node of any presently known or hereinafter developed messaging system.

As generally reflected by Step 1 of FIG. 9, messages from node 800 may be published on a source topic (e.g., Source Topic A) whereby they are consumed by a consumer process via a dedicated ETX thread consuming on Topic A. This marks the beginning of a distributed transaction involving the resources of an RDBMS 806, database 812 and the inbound node message bus. Together, RDBMS 806 and the associated database 812 manifest the file system symbolized by reference numeral 808 of FIG. 8. That is, the RDBMS 806 is a configurational database that manages the save store file references, including the START and END offsets, for the save store files that are stored on database 812. It will be understood, especially by reference to FIG. 10 discussed below, that Source Topic A may comprise messages on several topics, e.g., Topic B, Topic C, Topic D, etc., that are of interest to end consumers that have subscribed to consume messages on one or more of those topics.

At Step 2 of FIG. 9, after the messages are consumed from the inbound node 800, they are passed to a routing agent which builds the outbound messages and identifies their endpoints. This process involves the execution of one or more message handlers (described in greater detail in connection with FIG. 12) which may perform one or more of pre-routing transformation, key extraction, key mapping lookup and post-routing transformation. Outbound message endpoints are then acquired and any requisite endpoint transformations (again described in greater detail in connection with FIG. 12) are performed. Depending on the work to be performed on the messages, the steps of building outbound messages and identifying their endpoints are iterated as necessary by the message handlers.

At Step 3 of FIG. 9, the outbound messages, their END offsets and their endpoint destinations are persisted in the save store files and the save store references of the file system 808 comprised of the database 812 and the RDBMS 806. This process initially involves the identification of the appropriate endpoint transport for a message. This is followed by creation of unique save store file(s) for the source topic, the endpoint and the transport primary key (“PK”). At this time an index, preferably a timestamp representing the time of creation of a save store file, is created for each save store file and stored in database 812. An example of such an index is shown in FIG. 9 superimposed upon database 812 and identified as “A.ETX.TmStamp.P.” Following this, the outbound messages are written to the save store file(s) while the END offsets for the messages are correspondingly updated in order to persist this information in the file system. Persistence of outbound messages is iterated as necessary for each of the endpoints for the messages.

The consumer process iterates each of the foregoing steps for each message consumed from the message bus of the inbound node 800 depending on the batch size, timeout range and save store file size(s).

At Step 4 of FIG. 9, the consumer process commits the distributed transaction. It does this by storing the processed message batch in the database 812 and by instructing the RDBMS to save the message END offsets for the batch. As mentioned in connection with the discussion of FIG. 8, proper storage of the END offsets for the messages in a particular batch assures that no messages are republished by the consumer process in the event message replay becomes necessary.
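
By way of illustration only, and not limitation, the consumer-side persistence and commit of Steps 3 and 4 might be sketched as follows. The sketch makes several assumptions that are not part of the foregoing description: the offset table of the RDBMS is represented by a plain dictionary, each message is framed with a hypothetical 4-byte length prefix, and the function name `consume_batch` and the file name `A.ETX.save` are invented for the example.

```python
import os
import struct
import tempfile


def consume_batch(path, messages, end_offsets):
    """Append one batch to a save store file, then commit its END offset.

    end_offsets is a plain dict standing in for the offset table kept by
    the RDBMS (an assumption made for this sketch).
    """
    end = end_offsets.get(path, 0)
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        # Discard any bytes past the last committed END offset; they can
        # only belong to a batch that failed before it was committed.
        f.seek(end)
        f.truncate()
        for msg in messages:
            # Hypothetical framing: 4-byte big-endian length, then payload.
            f.write(struct.pack(">I", len(msg)) + msg)
        f.flush()
        os.fsync(f.fileno())
        new_end = f.tell()
    # Commit point: the batch counts as stored only once END is recorded.
    end_offsets[path] = new_end
    return new_end


workdir = tempfile.mkdtemp()
store_file = os.path.join(workdir, "A.ETX.save")
offsets = {}
first_end = consume_batch(store_file, [b"trade-1", b"trade-2"], offsets)

# Simulate a failure: bytes reach the file but END is never committed.
with open(store_file, "ab") as f:
    f.write(b"partial")

second_end = consume_batch(store_file, [b"trade-3"], offsets)
```

In this sketch the uncommitted bytes left by the simulated failure are overwritten by the next batch, so the file never holds a message beyond the committed END offset.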

Referring to FIG. 10, there is shown a detailed schematic of the publisher-side work process performed by a publisher process (such as the publisher process 804 of FIG. 8) in accordance with the further embodiment of the message transaction management scheme of the present invention. At Step 1 of FIG. 10, the publisher process begins to read the save store file(s) stored in database 812 via a dedicated ETX thread. At Step 2 of FIG. 10, the publisher process begins a save store file access process for the save store file(s). This process involves retrieving the first unpublished save store file and its associated START/END offsets, executing a callback (“CB”) routine to update the local START offset (upper bound) and maintaining START offset persistence.

At Step 3 of FIG. 10, the publisher process opens the save store file by seeking the lowest START offset for the file and then begins the publishing transaction. The publishing transaction is begun by batch publishing from the save store file. Batch publishing is a function of the batch size, maintenance of a START offset prior to the END offset for the file and the end of file (“EOF”) command associated with the file. The publisher process then opens the topics (e.g., Topic B, Topic C, Topic D) on demand and publishes them to the dedicated outbound node 810 (e.g., an ETX node). It also publishes the outbound messages to an unillustrated replay server having a database similar to database 732 of FIG. 7. Again, for purpose of illustration but not limitation, outbound node 810 is embodied as an ETX node, although it may be a communications node of any presently known or hereinafter developed messaging system.

At Step 4 of FIG. 10, the publisher process commits the distributed transaction. It does this by notifying the RDBMS 806 of the transmission of the message batch to the message bus of the outbound node 810 and by instructing the RDBMS to save the highest START offset for the transmitted batch. As mentioned in connection with the discussion of FIG. 8, storage of the highest START offset for a particular batch assures that no messages are republished by the publisher process in the event message replay becomes necessary.
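
By way of illustration only, the publisher-side batch publishing and commit of Steps 3 and 4 might be sketched as follows. The sketch assumes a hypothetical 4-byte length prefix on each stored message, represents the row of START/END offsets held by the RDBMS as a plain dictionary, and uses an arbitrary callable in place of the outbound node and replay server. The names `read_batch` and `publish_pending` are invented for the example.

```python
import os
import struct
import tempfile


def read_batch(path, start, end, batch_size):
    """Read up to batch_size messages lying between START and END."""
    messages = []
    with open(path, "rb") as f:
        f.seek(start)
        while len(messages) < batch_size and f.tell() < end:
            (length,) = struct.unpack(">I", f.read(4))
            messages.append(f.read(length))
        return messages, f.tell()


def publish_pending(path, row, publish, batch_size=2):
    """Publish everything between row['start'] and row['end'] in batches."""
    while row["start"] < row["end"]:
        batch, new_start = read_batch(path, row["start"], row["end"], batch_size)
        publish(batch)            # stands in for the outbound node and replay server
        row["start"] = new_start  # commit: highest START offset published so far


# Build a small save store file holding three length-prefixed messages.
path = os.path.join(tempfile.mkdtemp(), "A.ETX.save")
with open(path, "wb") as f:
    for msg in (b"m1", b"m2", b"m3"):
        f.write(struct.pack(">I", len(msg)) + msg)

row = {"start": 0, "end": os.path.getsize(path)}
sent = []
publish_pending(path, row, sent.append)
publish_pending(path, row, sent.append)  # START has reached END, so nothing is resent
```

Because the START offset is advanced only after a batch has been handed to the transport, a restarted publisher in this sketch resumes at the first unpublished message rather than at the beginning of the file.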

FIG. 11 is a simplified schematic diagram depicting the manner by which the routing system according to the present invention achieves fully scalable multithreaded, multi-topic message consumption, processing and publication. FIG. 11 reflects one of many possible implementations of the present routing system within an equities trading business enterprise. It will also be understood that the system may be advantageously deployed in any business or other enterprise that uses a messaging scheme over a computer network.

In FIG. 11, reference numeral 1100 generally indicates an instance of the routing system wherein a single consumer process C1 (corresponding to consumer process 802 of FIG. 8) communicates with two publisher processes P1 and P2 (each corresponding to publisher process 804 of FIG. 8). According to the present invention, however, any number of consumer processes may communicate with any number of publisher processes. As illustrated, messages are consumed by consumer process C1 from a messaging bus of a mainframe (“MF”) computer operating on an IBM MQ series messaging platform. After routing and other processing, those messages are ultimately published by publisher processes P1 and P2. As shown, publisher process P1 is a distributed user that publishes the messages on an ETX series messaging platform and publisher process P2 is a distributed user that publishes the messages on an MQ series messaging platform. It will be understood that consumer process C1 may be a distributed user and it may operate on a different messaging platform such as ETX. Similarly, publisher processes P1 and P2 may both publish on the same type of messaging platform.

According to the invention, each consumer process deals with only one messaging transport and each publisher process deals with only one messaging transport. That is, the number of consumer processes equals the number of inbound transports, and the number of publisher processes equals the number of outbound transports. An advantage of equating the number of consumer processes and publisher processes with their respective inbound and outbound transports is that the routing system does not have to be concerned with transactionally coordinating work across transports. Also, according to a preferred embodiment of the invention, a formula exists for naming files whereby a part of the file name includes the associated transport for a file. In so doing, a clear separation is maintained between transports and the files in which the transport data resides. It would be more complex if a single publisher process were to read one file and then have to publish a given message from that file to two different transports. Without a one-to-one correspondence between a publisher process and an outbound transport, publication to two or more disparate transactional transports would have to be coordinated with a single row of navigational data in the RDBMS 806. Such a situation can become quite complicated and requires messaging vendors to architect their products to be compatible with one another under XA Protocol, which is an industry standard relating to distributed transactions and the coordination of those transactions.

Further, each consumer process can run a consumer thread and each publisher process can run a publisher thread for each inbound topic/queue. That is, the maximum number of consumer threads equals the number of inbound topics/queues and the maximum number of publisher threads equals the number of inbound topics/queues. For simplicity, two such inbound topics/queues are shown in FIG. 11 and are identified as T1 and T2 (although any number of inbound topics/queues may be accommodated). By way of example, topic/queue T1 relates to trade messages and topic/queue T2 relates to journal messages.

As described in greater detail in regard to FIGS. 8-10 and 12, consumer process C1 includes a message handler that performs routing and message transformation that may be necessary to cause it to write the inbound messages to the publisher processes P1 and P2 via save store files. According to the invention, the number of save store files equals the number of inbound topics/queues times the number of outbound transports. In the present example, therefore, four save store files are created, i.e., files F1.Trades.MQ, F2.Trades.ETX, F3.Journals.MQ and F4.Journals.ETX, because two topics/queues T1 and T2 are handled by the two outbound messaging transports that service the publisher processes P1 and P2.
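
By way of illustration only, the relationship between inbound topics/queues, outbound transports and save store files in the example of FIG. 11 might be sketched as follows. The naming pattern used here is merely inferred from the four file names given above and is an assumption of the sketch; the function name `save_store_files` is likewise invented.

```python
from itertools import product


def save_store_files(topics, transports):
    """Return one save store file name per (inbound topic, outbound transport) pair."""
    pairs = product(topics, transports)
    return [
        f"F{n}.{topic}.{transport}"
        for n, (topic, transport) in enumerate(pairs, start=1)
    ]


files = save_store_files(["Trades", "Journals"], ["MQ", "ETX"])
# files == ['F1.Trades.MQ', 'F2.Trades.ETX', 'F3.Journals.MQ', 'F4.Journals.ETX']
```

The number of names produced is always the number of topics/queues multiplied by the number of transports, here two times two.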

The real-time message processing demands of large geographically-distributed businesses are substantial and continuously growing. In global securities trading businesses these demands are immense. As mentioned previously, presently available single-threaded messaging systems can accommodate a real-time data flow of about 35 messages per second (assuming an average message size of two kilobytes). In a large stock trading system, a real-time flow of data easily exceeds 35 messages per second. Using the present routing system, multiple threads of the system can be instantiated on single or multiple machines whereby topics/queues may be split among the multiple threads to optimize the number of threads needed to accommodate high volume message throughput in real time. Indeed, the present multithreaded system is capable of processing at least 100 logical units of work per second and therefore finds beneficial application in enterprises where real-time message processing demands are greatest.

Message Transformation and Transport Transformation

The message handler of the routing system of the present invention is an extensible piece of code, and plug-ins can be utilized to expand its functionality. This concept is particularly relevant when dealing with a variety of message formats. Because a router is only as intelligent as it is programmed to be, it needs to be able to process messages that enter and exit the router in different and changing formats.

Through cooperative efforts of publishers and consumers in the intended communication space, business logic is programmed into the router of the present invention by configuring the routing rules and introspection module. The specific information the router is looking for in a message is provided by the introspection module (a part of a logical unit of work which also does optional mapping of the routing keys to routing target(s) using a mapping table and makes routing decisions based on the routing target(s)).

A message can also be transformed as part of the application of complex routing logic. In such circumstances, the router may pass the message to a customer plug-in that transforms the message and returns the message to the router in the new format. Because such transformation is called for by the user, the user's routing logic needs to be aware of the format of the message to be processed. It is possible for a message to be evaluated against a first rule in one format, and evaluated against a second rule in a different format. To guard against an error condition, the explosion module of the second rule would need to be aware that the message is in a different format than that used in applying the first rule.

FIG. 12 provides an overview of the message routing and transformation functions of the routing system according to the present invention. As seen in that figure, a message handler performs routing and message transformation. As described above, routing typically includes key extraction and key mapping lookup. Message transformation may involve pre-routing transformation and post-routing transformation. In pre-routing transformation, a message is transformed or rules are applied to the message before routing in order to, for example, transform the message into a desired format that is understandable by the endpoint consumer(s) of the message. The consumer, in turn, supplies the tags necessary to enable the router to then perform routing of the transformed message. In post-routing transformation, a message is first routed and then is transformed by the router prior to consumption by the end consumer. Endpoint transformations are transformations that heretofore have been performed by endpoint subscribers in order to consume outbound messages following routing.
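
By way of illustration only, the ordering of pre-routing transformation, key extraction, key mapping lookup and post-routing transformation might be sketched as follows. The sketch assumes that messages are simple dictionaries, that the key mapping table is a dictionary from routing key to a list of endpoint topics, and that post-routing transformations are registered per endpoint. All function names and the sample data are invented for the example.

```python
def handle(message, pre_routing, extract_key, keymap, post_routing):
    """Return (endpoint, outbound message) pairs for one inbound message."""
    prepared = pre_routing(message)      # pre-routing transformation
    key = extract_key(prepared)          # key extraction
    endpoints = keymap.get(key, [])      # key mapping lookup
    outbound = []
    for endpoint in endpoints:
        # Post-routing transformation; endpoints with none registered
        # receive the message unchanged.
        transform = post_routing.get(endpoint, lambda m: m)
        outbound.append((endpoint, transform(prepared)))
    return outbound


def normalize(msg):
    """Hypothetical pre-routing step: upper-case the symbol field."""
    return {**msg, "symbol": msg["symbol"].upper()}


def to_csv(msg):
    """Hypothetical post-routing step for an endpoint that wants CSV text."""
    return f"{msg['symbol']},{msg['qty']}"


keymap = {"IBM": ["Topic B", "Topic C"]}
post_routing = {"Topic C": to_csv}
result = handle(
    {"symbol": "ibm", "qty": 100},
    normalize,
    lambda m: m["symbol"],
    keymap,
    post_routing,
)
```

Here the same inbound message is delivered unchanged on Topic B and in a different format on Topic C, which corresponds to the case of two listeners that consume in different formats.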

Endpoint subscribers may instruct the routing system of the present invention to perform message transformation based on a certain publishing topic name. According to the present invention, once the message transformation requirements for such a transformation are made known to the present routing system, the message handler can perform the necessary transformation as part of its message handling procedure.

It is also possible for endpoint users of the system that desire to consume messages in formats previously unrecognized by the routing system of the present invention to instruct the system to perform message transformation on messages so that they can be consumed by the endpoint users in the new formats. As reflected in FIG. 12, such message transformation may generally be referred to as endpoint transformation. For example, a producer or publisher of information may be publishing information in a proprietary format and two target systems may be listening to the router, wherein one of the listeners may be a legacy system that can consume the information in the proprietary format and the other listener may be a new system that can consume information only in a different or new format. With the concept of endpoint transformation, an end user or target listening to the router in a previously unrecognized format can cause the present routing system to perform post-routing transformation on future messages based on the needs of the new listener system.

The foregoing is especially useful for migrating the endpoint transformations of new listeners into message transformations that can be performed directly by the message handler. That is, when the common endpoint transformation procedures of a new group of target instances or endpoint subscribers are identified, the endpoint transformations formerly performed by those new target instances become post-routing transformations that can be automatically performed by the message handler when all new users that consume messages in the new format(s) have made the system aware of their need to consume messages in the new format(s).

Conversely, similar to the way in which the present routing system may migrate new endpoint transformations into the routing system as post-routing transformations, it may also be used to migrate from old, obsolete or otherwise undesirable publisher and listener messaging formats. That is, when a messaging format falls into disfavor as a standard messaging format or is used by a decreasing number of listeners in a messaging system that employs the present routing system, the routing system may be easily configured to migrate from the unwanted messaging format.

The present routing system also caches and maintains metadata on a rule-by-rule basis whereby end applications may continuously revise the metadata. For example, a mapping operation may be configured to be part of a particular message handler. Accordingly, the mapping table information will be loaded (cached) into process memory at the process initialization state. If an end application indicates to the system that the data associated with a particular keymap is stale, the end application can instruct the system to update that data. In order to handle the data update request, all routing will be paused and a special routine (usually provided by the end user) will be called to reload the mapping information from some resource external to the end user source (e.g., a file or a database).
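
By way of illustration only, a cached mapping table that pauses lookups while it is reloaded might be sketched as follows. The sketch assumes that the user-supplied reload routine is a callable returning a dictionary and that pausing of routing can be modeled with a single lock shared by lookups and reloads. The class name `KeymapCache` and the sample data are invented for the example.

```python
import threading


class KeymapCache:
    """Mapping table cached in process memory and reloadable on request."""

    def __init__(self, loader):
        self._loader = loader          # user-supplied reload routine (assumed callable)
        self._lock = threading.Lock()
        self._table = loader()         # loaded once at process initialization

    def lookup(self, key):
        # Lookups wait while a reload holds the lock, which models the
        # pausing of routing during a data update request.
        with self._lock:
            return self._table.get(key)

    def reload(self):
        with self._lock:
            self._table = self._loader()


# An external resource, represented here by a dictionary.
source = {"IBM": "Topic B"}
cache = KeymapCache(lambda: dict(source))

before = cache.lookup("IBM")
source["IBM"] = "Topic C"   # the external resource changes
stale = cache.lookup("IBM")  # the cached copy is now stale
cache.reload()               # the end application requests an update
after = cache.lookup("IBM")
```

The cached copy continues to return the old routing target until the end application requests the reload, after which subsequent lookups see the updated mapping.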

The present routing system is thus able to readily update its existing routing functions, incorporate new message transformations and message formats, and migrate from undesirable message transformations and message formats. Consequently, the present system is capable of performing highly complex routing/transformation functions and is extremely adaptable to an enterprise's evolving messaging needs.

It will be understood that the embodiments of the invention described herein are merely exemplary and that a person skilled in the art may make many variations and modifications without departing from the spirit and scope of the present invention. All such variations and modifications are intended to be included within the scope of the invention as defined in the appended claims.

1. A computerized message routing system comprising: (a) router means, said router means including means for consuming messages from a publisher, means for publishing the messages to at least one subscriber, and means for publishing the messages to a replay server; and (b) a replay server for storing all messages published by said router means and for republishing certain ones of the messages to a subscriber on demand of the subscriber.
2. The system of claim 1 wherein said replay server republishes messages directly to the subscriber.
3. The system of claim 1 wherein said replay server republishes messages to said router means for delivery by said router means to the subscriber.
4. A method for recovering messages that fail to reach their intended destinations in a computerized message routing system, the method comprising the steps of: (a) storing all messages published by a router on a replay server; and (b) republishing certain ones of the messages from the replay server to a subscriber on demand of the subscriber.
5. The method of claim 4 wherein step (b) comprises republishing certain ones of the messages by the replay server directly to a subscriber.
6. The method of claim 4 wherein step (b) comprises republishing certain ones of the messages by the replay server to the router for delivery by the router to the subscriber.
7. The method of claim 4 further comprising encoding the messages republished by the replay server such that the subscriber recognizes the messages as republished rather than originally published messages.
8. A computerized message routing system comprising: consumer process means for consuming messages from a publisher and for writing the messages to at least one file; publisher process means for reading messages that have been written by said consumer process means to said at least one file and for publishing the messages to at least one subscriber; and a file system in communication with said consumer process means and said publisher process means, said file system comprising: said at least one file, wherein said at least one file stores messages written from said consumer process means in batches; and means for maintaining write and read offsets for message batches that are written to said at least one file by said consumer process means and that are read from said at least one file by said publisher process means, whereby the write and read offsets enable data to be persisted in said at least one file such that duplicate messages are not written by said consumer process means to said at least one file or published by said publisher process means to the at least one subscriber in the event message recovery is required.
9. The system of claim 8 wherein the write and read offsets include: an END offset written by said consumer process means to said means for maintaining offsets for a batch of messages stored in said at least one file; and a START offset written by said publisher process means to said means for maintaining offsets for a batch of messages read from said at least one file.
10. The system of claim 9 wherein said START offset precedes said END offset for a batch of messages.
11. A method for preventing duplicate publication of data in a computerized message routing system comprising: consuming messages from a publisher and writing the messages in batches to at least one file; reading messages from the at least one file and publishing the messages to at least one subscriber; and maintaining write and read offsets for message batches that are written to and read from the at least one file, whereby the write and read offsets enable data to be persisted in the at least one file such that duplicate messages cannot be written to the at least one file or published to the at least one subscriber in the event message recovery is required.
12. The method of claim 11 wherein the step of maintaining write and read offsets includes: writing an END offset for a batch of messages stored in said at least one file; and writing a START offset for a batch of messages read from said at least one file.
13. The method of claim 11 wherein said START offset precedes said END offset for a batch of messages.
14. A method for expanding the messaging processing capability of a computerized message routing system comprising a message handler that performs routing of messages from a publisher to endpoint subscribers, pre-routing transformation of the messages prior to routing of the messages to the endpoint subscribers and post-routing transformation of the messages after routing of the messages to the endpoint subscribers, said method comprising the steps of: providing the message handler with endpoint message transformation procedures performed by a new group of endpoint subscribers that desire to receive messages in a format previously unrecognized by the message handler; and when all members of the new group of endpoint subscribers have made their endpoint message transformation procedures known to the message handler, automatically performing by the message handler the endpoint message transformation procedures formerly performed by the new group of endpoint subscribers as post-routing message transformation and delivering messages to the new group of endpoint subscribers in the format previously unrecognized by the message handler.
15. A computerized message routing system comprising: at least one inbound transport in communication with at least one consumer process that is operable to run at least one consumer thread for each inbound message topic; and at least one outbound transport in communication with at least one publisher process that is operable to run at least one publisher thread for each inbound message topic, wherein the at least one consumer process communicates with the at least one publisher process via at least one message file, wherein the number of consumer processes equals the number of inbound transports, wherein the number of publisher processes equals the number of outbound transports, wherein the maximum number of consumer threads equals the number of inbound message topics, wherein the maximum number of publisher threads equals the number of inbound message topics, and wherein the number of message files equals the number of inbound topics times the number of outbound transports.
16. A method for operating a computerized message routing system, said method comprising the steps of: providing at least one inbound transport in communication with at least one consumer process that is operable to run at least one consumer thread for each inbound message topic; providing at least one outbound transport in communication with at least one publisher process that is operable to run at least one publisher thread for each inbound message topic; and communicating the at least one consumer process with the at least one publisher process via at least one message file, wherein the number of consumer processes equals the number of inbound message transports, wherein the number of publisher processes equals the number of outbound message transports, wherein the maximum number of consumer threads equals the number of inbound message topics, wherein the maximum number of publisher threads equals the number of inbound message topics, and wherein the number of message files equals the number of inbound topics times the number of outbound transports.
17. A method for operating a computerized message routing system, said method comprising the steps of: (a) consuming a message from a message bus of an inbound messaging node; (b) invoking an introspection module based on a subject on which the message has been published to the inbound node; (c) examining the contents of the message; (d) extracting at least one routing key from the message based on the contents of the message; (e) examining the at least one routing key; (f) identifying a routing tag based on the at least one routing key; (g) evaluating the routing tag to determine whether the routing tag is bound to one or both of an outbound subject and a routing rule; and, either (h) if the routing tag is bound to an outbound subject, then publishing the message to a message bus of an outbound messaging node, or (i) if the routing tag is bound to a routing rule or a routing rule and an outbound subject, then extracting at least one routing key based on the routing rule and repeating steps (g), (h) and (i) until the message is published to a message bus of an outbound messaging node.
18. A computerized message routing system comprising: (a) router means, said router means including: (i) consumer process means for consuming messages from a publisher and for writing the messages to at least one file, and (ii) publisher process means for reading messages from said at least one file that have been written by said consumer process means to said at least one file, for publishing the messages to at least one subscriber and for publishing the messages to a replay server; (iii) a file system in communication with said consumer process means and said publisher process means, said file system comprising: said at least one file, wherein said at least one file stores messages written from said consumer process means in batches; and means for maintaining write and read offsets for message batches that are written to said at least one file by said consumer process means and that are read from said at least one file by said publisher process means, whereby the write and read offsets enable data to be persisted in said at least one file such that duplicate messages are not written by said consumer process means to said at least one file or published by said publisher process means to the at least one subscriber in the event message recovery is required; and (b) a replay server for storing all messages published by said publisher process means and for republishing certain ones of the messages to a subscriber on demand of the subscriber.
19. The system of claim 18 wherein said replay server republishes messages directly to the subscriber.
20. The system of claim 18 wherein said replay server republishes messages to said router means for delivery by said router means to the subscriber.
21. The system of claim 18 wherein the write and read offsets include: an END offset written by said consumer process means to said means for maintaining offsets for a batch of messages stored in said at least one file; and a START offset written by said publisher process means to said means for maintaining offsets for a batch of messages read from said at least one file.
22. The system of claim 21 wherein said START offset precedes said END offset for a batch of messages.
23. A computerized message routing system comprising: (a) router means, said router means including: (i) at least one consumer process means for consuming messages from a publisher and for writing the messages to at least one file; (ii) at least one publisher process means for reading messages from said at least one file, for publishing the messages to at least one subscriber, and for publishing the messages to a replay server; (iii) a file system in communication with said at least one consumer process means and said at least one publisher process means, said file system comprising: said at least one file, wherein said at least one file stores messages written from said at least one consumer process means in batches; and (b) means for maintaining write and read offsets for message batches that are written to said at least one file by said at least one consumer process means and that are read from said at least one file by said at least one publisher process means, whereby the write and read offsets enable data to be persisted in said at least one file such that duplicate messages are not written by said at least one consumer process means to said at least one file or published by said at least one publisher process means to the at least one subscriber in the event message recovery is required; (c) at least one inbound transport in communication with said at least one consumer process means, wherein said at least one consumer process means is operable to run at least one consumer thread for each inbound message topic; and (d) at least one outbound transport in communication with said at least one publisher process means wherein said at least one publisher process means is operable to run at least one publisher thread for each inbound message topic, wherein said at least one consumer process means communicates with said at least one publisher process means via said at least one file, wherein the number of consumer processes means equals the number of inbound transports, wherein the number of publisher processes means equals the number of outbound transports, wherein the maximum number of consumer threads equals the number of inbound message topics, wherein the maximum number of publisher threads equals the number of inbound message topics, and wherein the number of said at least one file equals the number of inbound topics times the number of outbound transports.
24. The system of claim 23 wherein the write and read offsets include: an END offset written by said at least one consumer process means to said means for maintaining offsets for a batch of messages stored in said at least one file; and a START offset written by said at least one publisher process means to said means for maintaining offsets for a batch of messages read from said at least one file.
25. The system of claim 24 wherein said START offset precedes said END offset for a batch of messages.
26. A computerized message routing system comprising: (a) at least one inbound transport in communication with at least one consumer process that is operable to run at least one consumer thread for each inbound message topic; and (b) at least one outbound transport in communication with at least one publisher process, said at least one publisher process being operable to run at least one publisher thread for each inbound message topic and to publish messages to at least one subscriber and to a replay server, wherein the at least one consumer process communicates with the at least one publisher process via at least one message file, wherein the number of consumer processes equals the number of inbound transports, wherein the number of publisher processes equals the number of outbound transports, wherein the maximum number of consumer threads equals the number of inbound message topics, wherein the maximum number of publisher threads equals the number of inbound message topics, and wherein the number of message files equals the number of inbound topics times the number of outbound transports; and (c) a replay server for storing all messages published by said at least one publisher process and for republishing certain ones of the messages to a subscriber on demand of the subscriber.
27. The system of claim 26 wherein said replay server republishes messages directly to the subscriber.
28. The system of claim 26 wherein said replay server republishes messages to said router means for delivery by said router means to the subscriber.
29. A computerized message routing system comprising: (a) router means, said router means including: (i) at least one consumer process means for consuming messages from a publisher and for writing the messages to at least one file, and (ii) at least one publisher process means for reading messages from said at least one file that have been written by said consumer process means to said at least one file, for publishing the messages to at least one subscriber and for publishing the messages to a replay server; (iii) a file system in communication with said at least one consumer process means and said at least one publisher process means, said file system comprising: said at least one file, wherein said at least one file stores messages written from said at least one consumer process means in batches; and means for maintaining write and read offsets for message batches that are written to said at least one file by said at least one consumer process means and that are read from said at least one file by said at least one publisher process means, whereby the write and read offsets enable data to be persisted in said at least one file such that duplicate messages are not written by said at least one consumer process means to said at least one file or published by said at least one publisher process means to the at least one subscriber in the event message recovery is required; (b) at least one inbound transport in communication with said at least one consumer process means, wherein said at least one consumer process means is operable to run at least one consumer thread for each inbound message topic; (c) at least one outbound transport in communication with said at least one publisher process means wherein said at least one publisher process means is operable to run at least one publisher thread for each inbound message topic, wherein said at least one consumer process means communicates with said at least one publisher process means via said at least one file, wherein the number of consumer processes means equals the number of inbound transports, wherein the number of publisher processes means equals the number of outbound transports, wherein the maximum number of consumer threads equals the number of inbound message topics, wherein the maximum number of publisher threads equals the number of inbound message topics, and wherein the number of said at least one file equals the number of inbound topics times the number of outbound transports; and (d) a replay server for storing all messages published by said at least one publisher process means and for republishing certain ones of the messages to a subscriber on demand of the subscriber.
30. The system of claim 29 wherein the write and read offsets include: an END offset written by said at least one consumer process means to said means for maintaining offsets for a batch of messages stored in said at least one file; and a START offset written by said at least one publisher process means to said means for maintaining offsets for a batch of messages read from said at least one file.
31. The system of claim 30 wherein said START offset precedes said END offset for a batch of messages.
32. The system of claim 29 wherein said replay server republishes messages directly to the subscriber.
33. The system of claim 29 wherein said replay server republishes messages to said router means for delivery by said router means to the subscriber.